
    Shot-based object retrieval from video with compressed Fisher vectors

    This paper addresses the problem of retrieving, from a database of video sequences, the shots that match a query image. Existing architectures are mainly based on the Bag-of-Words model, which matches the query image against a high-level representation of local features extracted from the video database. However, such architectures lack the capability to scale up to very large databases. Recently, Fisher Vectors have shown promising results in large-scale image retrieval, but it is still not clear how they can best be exploited in video-related applications. In our work, we use compressed Fisher Vectors to represent video shots and show that the inherent correlation between video frames can be proficiently exploited. Experiments show that our proposal achieves better performance at lower computational cost than similar architectures.
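
    Below is a minimal sketch of the kind of pipeline the abstract describes: per-frame local descriptors are aggregated into Fisher Vectors against a GMM vocabulary, the per-frame vectors are pooled into a single shot-level signature, and a simple sign binarization stands in for the compression step. The descriptor dimensionality, GMM size, mean pooling, and binarization are illustrative assumptions, not the paper's exact method.

        # Sketch: shot-level Fisher vector retrieval (assumptions: random data
        # stands in for SIFT-like local descriptors; mean pooling and sign
        # binarization stand in for the paper's aggregation and compression).
        import numpy as np
        from sklearn.mixture import GaussianMixture

        def fisher_vector(descriptors, gmm):
            """First-order Fisher vector: gradient w.r.t. the GMM means."""
            q = gmm.predict_proba(descriptors)            # (N, K) soft assignments
            diff = descriptors[:, None, :] - gmm.means_   # (N, K, D)
            fv = (q[:, :, None] * diff / np.sqrt(gmm.covariances_)).sum(axis=0)
            fv /= descriptors.shape[0] * np.sqrt(gmm.weights_)[:, None]
            fv = fv.ravel()
            fv = np.sign(fv) * np.sqrt(np.abs(fv))        # power normalization
            return fv / (np.linalg.norm(fv) + 1e-12)      # L2 normalization

        rng = np.random.default_rng(0)
        train_descs = rng.normal(size=(5000, 64))         # stand-in local descriptors
        shot_frames = [rng.normal(size=(200, 64)) for _ in range(8)]  # one shot, 8 frames

        gmm = GaussianMixture(n_components=16, covariance_type='diag',
                              random_state=0).fit(train_descs)
        shot_fv = np.mean([fisher_vector(f, gmm) for f in shot_frames], axis=0)
        shot_code = shot_fv > 0                           # toy stand-in for FV compression

    Pooling per-frame Fisher Vectors into one shot-level signature is one simple way to amortize the redundancy between frames; retrieval then reduces to comparing the query image's vector against one compact code per shot.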

    Learnable Descriptors for Visual Search

    This work proposes LDVS, a learnable binary local descriptor devised for matching natural images within the MPEG CDVS framework. LDVS descriptors are learned so that they can be sign-quantized and compared using the Hamming distance. The underlying convolutional architecture has a moderate parameter count, suitable for operation on mobile devices. Our experiments show that LDVS descriptors perform favorably against comparable learned binary descriptors at patch matching on two different datasets. A complete pair-wise image matching pipeline is then designed around LDVS descriptors, integrating them into the reference CDVS evaluation framework. Experiments show that LDVS descriptors outperform the compressed CDVS SIFT-like descriptors at pair-wise image matching on the challenging CDVS image dataset.
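
    The sketch below illustrates the matching scheme described above, under stated assumptions: a small convolutional network maps a grayscale patch to a real-valued vector, a tanh serves as a smooth surrogate for sign quantization during training, and binarized descriptors are compared with the Hamming distance. The layer sizes, patch size, and 256-bit code length are hypothetical; the actual LDVS architecture and training loss are in the paper.

        # Sketch: sign-quantized binary patch descriptor compared via Hamming
        # distance (layer sizes and code length are assumptions).
        import torch
        import torch.nn as nn

        class BinaryDescriptor(nn.Module):
            def __init__(self, bits=256):
                super().__init__()
                self.features = nn.Sequential(
                    nn.Conv2d(1, 32, 3, stride=2, padding=1), nn.ReLU(),
                    nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
                    nn.AdaptiveAvgPool2d(4), nn.Flatten(),
                    nn.Linear(64 * 4 * 4, bits),
                )

            def forward(self, patch):
                # tanh is a differentiable surrogate for sign() at train time;
                # at inference the binary code is the sign of the activations
                return torch.tanh(self.features(patch))

        def hamming(a, b):
            """Hamming distance between sign-quantized descriptors."""
            return ((a.sign() * b.sign()) < 0).sum(dim=-1)

        net = BinaryDescriptor()
        p1, p2 = torch.randn(1, 1, 32, 32), torch.randn(1, 1, 32, 32)
        dist = hamming(net(p1), net(p2))    # low distance suggests a match

    Sign quantization makes the stored descriptor one bit per dimension, and the Hamming distance can be computed with XOR and popcount, which is what makes this style of descriptor attractive on mobile hardware.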

    Capsule Networks with Routing Annealing

    International audience.

    HEMP: High-order entropy minimization for neural network compression

    We formulate the entropy of a quantized artificial neural network as a differentiable function that can be plugged as a regularization term into the cost function minimized by gradient descent. Our formulation scales efficiently beyond the first order and is agnostic to the quantization scheme. The network can then be trained to minimize the entropy of the quantized parameters, so that they can be optimally compressed via entropy coding. We experiment with our entropy formulation by quantizing and compressing well-known network architectures over multiple datasets. Our approach compares favorably with similar methods, enjoying the benefits of a higher-order entropy estimate, flexibility towards non-uniform quantization (we use Lloyd-Max quantization), scalability to any entropy order to be minimized, and efficiency in terms of compression. We show that HEMP works in synergy with other approaches that prune or quantize the model itself, delivering significant benefits in storage size without harming the model's performance.
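
    As a concrete illustration of the idea, the sketch below implements a differentiable first-order entropy estimate over soft quantization-bin assignments, which can be added to a task loss as a regularizer. The temperature-scaled softmax assignment and the fixed uniform codebook are assumptions made for clarity; HEMP itself estimates higher-order entropy and pairs with Lloyd-Max quantization.

        # Sketch: differentiable (first-order) entropy of quantized weights,
        # usable as a regularization term (the softmax relaxation and uniform
        # 4-bit codebook are illustrative assumptions, not HEMP's formulation).
        import torch

        def soft_entropy(weights, codebook, temperature=0.1):
            """Differentiable estimate, in bits, of the quantized-weight entropy."""
            dist = (weights.reshape(-1, 1) - codebook.reshape(1, -1)) ** 2
            assign = torch.softmax(-dist / temperature, dim=1)  # soft bin membership
            probs = assign.mean(dim=0)                          # estimated bin probabilities
            return -(probs * torch.log2(probs + 1e-12)).sum()

        codebook = torch.linspace(-1.0, 1.0, steps=16)  # hypothetical 4-bit grid
        w = torch.randn(1000, requires_grad=True)       # stand-in for model weights
        reg = soft_entropy(w, codebook)
        (0.01 * reg).backward()  # entropy gradients flow back into the weights

    Because the bin assignments are soft, the estimated entropy is differentiable in the weights, so gradient descent can push the weight distribution toward fewer effective codewords, which entropy coding then exploits at storage time.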